-
Notifications
You must be signed in to change notification settings - Fork 471
Add membarrier call in Android. #228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add membarrier call in Android. #228
Conversation
membarrier syscall is not available in android / bionic. Android uses cutils internally to execute the membarrier instruction, but is not publicly available in the NDK. This shim takes the implementation from cutils as a reference. https://android.googlesource.com/platform/system/core/+/48b911c573c92742 aa80270b734811f722c67c37/include/cutils/atomic-arm.h
no, sys_membarrier gives stronger guarantees that what you are doing. what you call membarrier is just c11's seq_cst barrier and is not enough for dispatch_once() |
@MadCoder get it, sorry about that. Would Was it replaced due to inaccuracy / inefficiency? Or just for convenience? Thanks for your time. |
atomic_thread_fence() is what I said, it is a c11 barrier which is not enough for dispatch once, it needs something that make sure no other CPU can see old things, which if you don't have more guarantees on your hardware, requires sys_membarrier() which is an IPI on all cores. Alternatively, platforms that don't have a sys_membarrier should remove the inline fastpath and only have a function call with an acquire barrier. |
As I see libdispatch uses Why is it not enough now? |
It was never correct. The barrier required here is one that ensures other cores can't see stale values for the store(s) this dispatch_once did as the read side has no barrier whatsoever. Most hardware has no instruction that can guarantee that, the only way is to IPI all the cores which is what sys_membarrier does. The alternative is to put an acquire barrier on the read side. However this more or less becomes equivalent to a lock and defeats the point of the primitive. |
As I see read side uses something like |
You are incorrect, the read side is inlined from https://github.com/apple/swift-corelibs-libdispatch/blob/master/dispatch/once.h#L99 I think that given how widely dispatch_once is used, the IPI is not great, so the right fix is likely to:
the challenge is that In #if defined(__cplusplus) && __has_header(<atomic>)
#include <atomic>
#define DISPATCH_HAS_C11_ATOMICS 1
#define DISPATCH_ATOMIC(type) std::atomic<type>
#define DISPATCH_ATOMIC_STD(name) std::name
#elif __has_header(<stdatomic.h>)
#include <stdatomic.h>
#define DISPATCH_HAS_C11_ATOMICS 1
#define DISPATCH_ATOMIC(type) _Atomic(type)
#define DISPATCH_ATOMIC_STD(name) name
#else
#define DISPATCH_HAS_C11_ATOMICS 0
#endif And then in once.h:
and make dispatch_once_f this way: #if DISPATCH_HAS_C11_ATOMICS
DISPATCH_INLINE DISPATCH_ALWAYS_INLINE DISPATCH_NONNULL1 DISPATCH_NONNULL3
DISPATCH_NOTHROW
DISPATCH_SWIFT3_UNAVAILABLE("Use lazily initialized globals instead")
void
_dispatch_once_f(dispatch_once_t *predicate, void *_Nullable context,
dispatch_function_t function)
{
DISPATCH_ATOMIC(dispatch_once_t) *atomic_pred =
(DISPATCH_ATOMIC(dispatch_once_t) *)predicate;
dispatch_once_t v = DISPATCH_ATOMIC_STD(atomic_load_explicit)(atomic_pred,
DISPATCH_ATOMIC_STD(DISPATCH_ONCE_READ_BARRIER));
if (DISPATCH_EXPECT(v == ~0l)) return;
dispatch_once_f(predicate, context, function);
DISPATCH_COMPILER_CAN_ASSUME(*predicate == ~0l);
}
#undef dispatch_once_f
#define dispatch_once_f _dispatch_once_f
#endif // DISPATCH_HAS_C11_ATOMICS the same should be done for dispatch_once. However I'm not keen on injecting |
And to be clear the reason why I'm not keen on injecting We may decide that if there's no way to do the fully relaxed loads we would go through the function call always and be done with it (in dispatch we have all we need to write that acquire load), we'd have to make the code of |
Ok thanks, much clearer now. Maybe it is possible just move done check to the beginning of #if defined(__i386__) || defined(__x86_64__) || defined(__s390x__)
#define DISPATCH_ONCE_READ_BARRIER relaxed
#else
#define DISPATCH_ONCE_READ_BARRIER acquire
#endif
DISPATCH_NOINLINE
void
dispatch_once_f(dispatch_once_t *val, void *ctxt, dispatch_function_t func)
{
dispatch_once_t v = os_atomic_load(val, DISPATCH_ONCE_READ_BARRIER);
if (likely(v == ~0l)) {
return;
}
... |
membarrier syscall is not available in android / bionic. Android uses
cutils internally to execute the membarrier instruction, but is not
publicly available in the NDK. This shim takes the implementation from
cutils as a reference.
https://android.googlesource.com/platform/system/core/+/48b911c573c92742
aa80270b734811f722c67c37/include/cutils/atomic-arm.h